A New Approach to Bangla Text Extraction and Recognition From Textual Image
Authors
Abstract
This paper presents a new approach to segmenting and recognizing printed Bangla text using characteristic functions and a Hamming network. The main difficulties in printed Bangla text recognition are the separation of lines, words, and individual characters. A new algorithm is proposed to detect and separate text lines, words, and characters in printed Bangla text. The algorithm uses a set of characteristic functions to segment the upper portions of some characters and the characters that fall below the baseline. It also combines the flood-fill and boundary-fill algorithms to segment characters that cannot be separated using the traditional approach. A Hamming network is used for recognition, which is performed on both isolated and continuous printed characters, independent of size.
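The recognition stage named in the abstract can be illustrated with a minimal sketch of a textbook Hamming network; it is not the authors' implementation. It assumes each segmented character has already been scaled to a fixed grid and flattened into a bipolar (+1/-1) vector, which is one common way to obtain size independence. The function names (`matching_scores`, `maxnet`, `classify`), the `epsilon` default, and the toy templates are hypothetical.

```python
def matching_scores(templates, x):
    """Layer 1: for bipolar vectors, score = n - Hamming distance = (w.x + n) / 2."""
    n = len(x)
    return [(sum(w * xi for w, xi in zip(t, x)) + n) / 2 for t in templates]


def maxnet(scores, epsilon=None, max_iter=1000):
    """Layer 2 (MAXNET): lateral inhibition until at most one unit stays positive."""
    m = len(scores)
    if epsilon is None:
        epsilon = 1.0 / (2 * m)  # assumed inhibition weight, kept below 1/m
    y = [float(s) for s in scores]
    for _ in range(max_iter):
        total = sum(y)
        # Each unit is inhibited by the summed activity of all other units.
        y = [max(0.0, yi - epsilon * (total - yi)) for yi in y]
        if sum(1 for v in y if v > 0) <= 1:
            break
    # Index of the surviving unit (first index if several are tied).
    return max(range(m), key=lambda i: y[i])


def classify(templates, x):
    """Return the index of the stored template closest to x in Hamming distance."""
    return maxnet(matching_scores(templates, x))


# Toy 6-pixel "characters" (hypothetical data, not Bangla glyphs).
TEMPLATES = [
    [1, 1, 1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1],
    [-1, -1, -1, 1, 1, 1],
]
noisy = [1, -1, 1, -1, 1, 1]  # template 1 with its last pixel flipped
print(classify(TEMPLATES, noisy))
```

In this sketch the first layer stores one template per character class, and the second layer only selects the largest matching score, so adding a class means adding one template row.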
Similar Resources
Bangla Text Recognition from Video Sequence: A New Focus
Extraction and recognition of Bangla text from video frame images is challenging due to complex color backgrounds, low resolution, etc. In this paper, we propose an algorithm for extraction and recognition of Bangla text from such video frames with complex backgrounds. Here, a two-step approach has been proposed. First, the text line is segmented into words using information based on line contours...
EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS
Due to the explosive growth of the World Wide Web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...
A New Text-Mining Method for Extracting User Context Information to Improve the Ranking of Search Engine Results
Today, the importance of text processing and its uses is well known among researchers and students. The amount of textual and documentary material increases day by day, so we need useful ways to store it and retrieve information from it. For example, search engines such as Google, Yahoo, and Bing need to read a great many web documents and retrieve the ones most similar to the user ...
Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images
In this paper, a novel approach for word extraction and character segmentation from the handwritten Bangla document images is reported. At first, a modified Run Length Smoothing Algorithm (RLSA), called Spiral Run Length Smearing Algorithm (SRLSA), is applied for the extraction of words from the text lines of unconstrained handwritten Bangla document images. This technique has helped to overcom...
Bangla Basic Character Recognition Using Digital Curvelet Transform
This paper addresses the problem of Bangla basic character recognition. Multi-font Bangla character recognition has not been attempted previously. Twenty popular Bangla fonts have been used for the purpose of character recognition. A novel feature extraction scheme based on the digital curvelet transform is proposed. The curvelet transform, although heavily utilized in various areas of image pr...